@rich-iannone
Member

This PR adds the .fmt_engineering() formatting method. Engineering notation expresses values so that they align to certain SI prefixes. Here is a table that compares select SI prefixes and their symbols to decimal and engineering-notation representations of key numbers.

import polars as pl
from great_tables import GT

prefixes_df = pl.DataFrame({
    "name": [
        "peta", "tera", "giga", "mega", "kilo",
        None,
        "milli", "micro", "nano", "pico", "femto"
    ],
    "symbol": [
        "P", "T", "G", "M", "k",
        None,
        "m", "μ", "n", "p", "f"
    ],
    "decimal": [float(10**i) for i in range(15, -18, -3)],
})

prefixes_df = prefixes_df.with_columns(
    engineering=pl.col("decimal")
)

(
    GT(prefixes_df)
    .fmt_number(columns="decimal", n_sigfig=1)
    .fmt_engineering(columns="engineering")
    .sub_missing()
)
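
To make the prefix alignment concrete, here is a minimal pure-Python sketch of how a value could be split into a mantissa and a multiple-of-three exponent, then matched to the SI symbols from the table above. This is not the great_tables implementation; `SI_SYMBOLS`, `engineering_parts()`, and `with_si_prefix()` are hypothetical names used only for illustration.

```python
import math

# SI prefix symbols keyed by engineering exponent (same rows as the table above).
# The unprefixed row (exponent 0) maps to an empty string.
SI_SYMBOLS = {
    15: "P", 12: "T", 9: "G", 6: "M", 3: "k", 0: "",
    -3: "m", -6: "μ", -9: "n", -12: "p", -15: "f",
}


def engineering_parts(x: float) -> tuple[float, int]:
    """Split x into (mantissa, exponent) where the exponent is a multiple of 3."""
    if x == 0:
        return 0.0, 0
    # Round the base-10 exponent down to the nearest multiple of 3.
    # Values extremely close to a power of 1000 may land in the neighbouring
    # bucket because of floating-point rounding; this sketch does not guard
    # against that.
    exponent = 3 * math.floor(math.log10(abs(x)) / 3)
    return x / 10.0**exponent, exponent


def with_si_prefix(x: float, unit: str = "") -> str:
    """Render x with an SI prefix symbol (illustrative helper, assumed name)."""
    mantissa, exponent = engineering_parts(x)
    symbol = SI_SYMBOLS.get(exponent)
    if symbol is None:
        # Outside the table: fall back to a plain exponent.
        return f"{mantissa:.2f}e{exponent} {unit}".strip()
    return f"{mantissa:.2f} {symbol}{unit}".strip()
```

For instance, `with_si_prefix(82794, "m")` groups the value under the kilo prefix, because 82794 lies between 10^3 and 10^6.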

Fixes: #785

@codecov

codecov bot commented Oct 20, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 91.77%. Comparing base (c86242f) to head (0692903).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #786      +/-   ##
==========================================
+ Coverage   91.61%   91.77%   +0.15%     
==========================================
  Files          47       47              
  Lines        5773     5821      +48     
==========================================
+ Hits         5289     5342      +53     
+ Misses        484      479       -5     


@rich-iannone rich-iannone marked this pull request as ready for review October 20, 2025 01:51
@rich-iannone rich-iannone requested a review from machow as a code owner October 20, 2025 01:51
@rich-iannone rich-iannone changed the title Feat fmt engineering feat: add fmt_engineering() Oct 20, 2025
@machow
Collaborator

machow commented Jan 12, 2026

Code review (updated)

MICHAEL NOTES:

  • remove _value_to_engineering_notation() dead code?
  • investigate formatting / linter discrepancies (do you have a different version of ruff, or are we not pinning it, etc.?)
  • test case having lots of inputs is worth investigating (e.g. maybe break it up, test 1 input at a time, etc.)

Found 7 issues, ordered by confidence:

High confidence (80+):

  1. Unused sep_mark parameter (confidence: 100) - The parameter is documented to format digits like "1,000", but the implementation hardcodes use_seps=False which prevents any digit separation. This same pattern was previously fixed in fmt_scientific().

drop_trailing_zeros=drop_trailing_zeros,
drop_trailing_dec_mark=drop_trailing_dec_mark,
use_seps=False,
sep_mark=sep_mark,
dec_mark=dec_mark,
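
As a rough illustration of why the parameter is inert, the sketch below uses a hypothetical stand-in formatter, `format_decimal()`, which is not great_tables code. It only assumes that `use_seps` gates whether grouping happens at all, so any `sep_mark` passed alongside `use_seps=False` cannot change the result.

```python
def format_decimal(
    value: float,
    decimals: int = 2,
    use_seps: bool = True,
    sep_mark: str = ",",
    dec_mark: str = ".",
) -> str:
    """Hypothetical stand-in for a decimal formatter (illustration only)."""
    # Grouping separators are only produced when use_seps is True.
    text = f"{value:,.{decimals}f}" if use_seps else f"{value:.{decimals}f}"
    # Swap marks through a placeholder so sep_mark and dec_mark cannot
    # overwrite each other (e.g. sep_mark="." with dec_mark=",").
    return text.replace(",", "\x00").replace(".", dec_mark).replace("\x00", sep_mark)


# With use_seps=False, every choice of sep_mark gives the same output.
no_seps_space = format_decimal(1234.5, use_seps=False, sep_mark=" ")
no_seps_dot = format_decimal(1234.5, use_seps=False, sep_mark="'")
```

Under that assumption, a caller passing `sep_mark` would see no effect, which is the mismatch between documentation and behavior that the issue describes.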

  2. PR needs rebase onto main (confidence: 100) - The branch predates PR #793 ("feat: support polars expressions in vals functions"), which added Polars expression support via the @expressive decorator. Merging as-is will remove this functionality from all val_fmt_* functions. After rebasing, val_fmt_engineering() should also be decorated with @expressive.

X: TypeAlias = "Any | list[Any] | SeriesLike"

Medium confidence (50-79):

  3. Redundant local import (confidence: 50) - fmt_engineering_context imports math locally at line 1077 when it's already imported at module level. A linter would catch this.

x: float | None,

  4. Doesn't use existing helper (confidence: 50) - The PR implements custom engineering notation logic instead of using or enhancing _value_to_engineering_notation. However, the existing helper lacks features such as a decimals parameter, so there's a valid reason.

# Scale `x` value by a defined `scale_by` value
x = x * scale_by
# Determine whether the value is positive
is_positive = _has_positive_value(value=x)

Lower confidence (< 50):

  5. Import style refactor mixed with feature (confidence: 25) - vals.py imports were reorganized from grouped to individual statements. Minor style change, not a sweeping refactor.

from ._formats_vals import (
val_fmt_bytes as fmt_bytes,
)
from ._formats_vals import (
val_fmt_currency as fmt_currency,
)
from ._formats_vals import (
val_fmt_date as fmt_date,
)
from ._formats_vals import (
val_fmt_engineering as fmt_engineering,
)
from ._formats_vals import (
val_fmt_image as fmt_image,
)
from ._formats_vals import (
val_fmt_integer as fmt_integer,
)
from ._formats_vals import (
val_fmt_markdown as fmt_markdown,
)
from ._formats_vals import (
val_fmt_number as fmt_number,
)
from ._formats_vals import (
val_fmt_percent as fmt_percent,
)
from ._formats_vals import (
val_fmt_roman as fmt_roman,
)
from ._formats_vals import (
val_fmt_scientific as fmt_scientific,
)
from ._formats_vals import (
val_fmt_time as fmt_time,
)

  6. Test cases use many inputs (confidence: 25) - First test case uses 13 values when fewer would suffice. However, the "test one behavior" pattern isn't in the main CLAUDE.md.

(
dict(decimals=2),
[
829300232923103939802.4,
492032183020.5,
84930284002.1,
203820929.2,
84729202.4,
2323435.1,
230323.4,
50000.01,
1000.001,
10.00001,
1.2345,
0.12345,
0.0000123456,
],
[
"829.30 × 10<sup style='font-size: 65%;'>18</sup>",
"492.03 × 10<sup style='font-size: 65%;'>9</sup>",
"84.93 × 10<sup style='font-size: 65%;'>9</sup>",
"203.82 × 10<sup style='font-size: 65%;'>6</sup>",
"84.73 × 10<sup style='font-size: 65%;'>6</sup>",
"2.32 × 10<sup style='font-size: 65%;'>6</sup>",
"230.32 × 10<sup style='font-size: 65%;'>3</sup>",
"50.00 × 10<sup style='font-size: 65%;'>3</sup>",
"1.00 × 10<sup style='font-size: 65%;'>3</sup>",
"10.00",
"1.23",
"123.45 × 10<sup style='font-size: 65%;'>−3</sup>",
"12.35 × 10<sup style='font-size: 65%;'>−6</sup>",
],

  7. Redundant test data across cases (confidence: 25) - Multiple test cases for force_sign_m and force_sign_n use identical 7-value input lists when 2-3 values would demonstrate each feature.

dict(decimals=2, force_sign_m=True),
[-3.49e13, -3453, -0.000234, 0, 0.00007534, 82794, 7.16e14],
[
"−34.90 × 10<sup style='font-size: 65%;'>12</sup>",
"−3.45 × 10<sup style='font-size: 65%;'>3</sup>",
"−234.00 × 10<sup style='font-size: 65%;'>−6</sup>",
"0.00",
"+75.34 × 10<sup style='font-size: 65%;'>−6</sup>",
"+82.79 × 10<sup style='font-size: 65%;'>3</sup>",
"+716.00 × 10<sup style='font-size: 65%;'>12</sup>",
],
),
(
dict(decimals=2, force_sign_n=True),
[-3.49e13, -3453, -0.000234, 0, 0.00007534, 82794, 7.16e14],
[
"−34.90 × 10<sup style='font-size: 65%;'>+12</sup>",
"−3.45 × 10<sup style='font-size: 65%;'>+3</sup>",
"−234.00 × 10<sup style='font-size: 65%;'>−6</sup>",
"0.00",
"75.34 × 10<sup style='font-size: 65%;'>−6</sup>",
"82.79 × 10<sup style='font-size: 65%;'>+3</sup>",
"716.00 × 10<sup style='font-size: 65%;'>+12</sup>",
],
),
(
dict(decimals=2, force_sign_m=True, force_sign_n=True),
[-3.49e13, -3453, -0.000234, 0, 0.00007534, 82794, 7.16e14],
[
"−34.90 × 10<sup style='font-size: 65%;'>+12</sup>",
"−3.45 × 10<sup style='font-size: 65%;'>+3</sup>",
"−234.00 × 10<sup style='font-size: 65%;'>−6</sup>",
"0.00",
"+75.34 × 10<sup style='font-size: 65%;'>−6</sup>",
"+82.79 × 10<sup style='font-size: 65%;'>+3</sup>",
"+716.00 × 10<sup style='font-size: 65%;'>+12</sup>",
],


Generated with Claude Code

If this review was useful, please react with 👍. Otherwise, react with 👎.

@rich-iannone
Member Author

rich-iannone commented Jan 15, 2026

Code review (updated)

MICHAEL NOTES:

  • remove _value_to_engineering_notation() dead code?

Removed the unused _value_to_engineering_notation() helper function.

  • investigate formatting / linter discrepancies (do you have a different version of ruff, or are we not pinning it, etc.?)

The import style changes in vals.py were caused by auto-formatting on save. I added an exclusion rule to .vscode/settings.json to circumvent this.

  • test case having lots of inputs is worth investigating (e.g. maybe break it up, test 1 input at a time, etc.)

Done. Simplified test cases significantly. Reduced from ~70 test values down to ~30 while maintaining coverage of key behaviors (positive/negative values, extreme magnitudes, zero handling, all exp_styles, force_sign options, etc.).

Found 7 issues, ordered by confidence:

High confidence (80+):

  1. Unused sep_mark parameter (confidence: 100) - The parameter is documented to format digits like "1,000", but the implementation hardcodes use_seps=False which prevents any digit separation. This same pattern was previously fixed in fmt_scientific().

drop_trailing_zeros=drop_trailing_zeros,
drop_trailing_dec_mark=drop_trailing_dec_mark,
use_seps=False,
sep_mark=sep_mark,
dec_mark=dec_mark,

Fixed. I completely removed the sep_mark= parameter from both the fmt_engineering() and val_fmt_engineering() function signatures, docstrings, and internal code.

In engineering notation the integer part of the mantissa only ranges from 1 to 999, so digit-grouping separators never apply to it. The sep_mark= parameter could in principle separate digits in very large exponents, but such usage is rare and outside formatting limits anyway.

Note that fmt_scientific() currently has the same issue (has a sep_mark= parameter in the signature but doesn't use it). This can be addressed in a follow-up PR.

  2. PR needs rebase onto main (confidence: 100) - The branch predates PR #793 ("feat: support polars expressions in vals functions"), which added Polars expression support via the @expressive decorator. Merging as-is will remove this functionality from all val_fmt_* functions. After rebasing, val_fmt_engineering() should also be decorated with @expressive.

X: TypeAlias = "Any | list[Any] | SeriesLike"

Done. Branch has been rebased onto main and val_fmt_engineering() now has the @expressive decorator to support Polars expressions.

Medium confidence (50-79):

  3. Redundant local import (confidence: 50) - fmt_engineering_context imports math locally at line 1077 when it's already imported at module level. A linter would catch this.

x: float | None,

Fixed. Removed the local import math line since math is already imported at the module level.

  4. Doesn't use existing helper (confidence: 50) - The PR implements custom engineering notation logic instead of using or enhancing _value_to_engineering_notation. However, the existing helper lacks features such as a decimals parameter, so there's a valid reason.

# Scale `x` value by a defined `scale_by` value
x = x * scale_by
# Determine whether the value is positive
is_positive = _has_positive_value(value=x)

The existing _value_to_engineering_notation() helper was limited as it lacked support for decimals and other options. The new implementation uses _value_to_decimal_notation() which provides all the necessary formatting options. The old unused helper has been removed.

Lower confidence (< 50):

  5. Import style refactor mixed with feature (confidence: 25) - vals.py imports were reorganized from grouped to individual statements. Minor style change, not a sweeping refactor.

from ._formats_vals import (
val_fmt_bytes as fmt_bytes,
)
from ._formats_vals import (
val_fmt_currency as fmt_currency,
)
from ._formats_vals import (
val_fmt_date as fmt_date,
)
from ._formats_vals import (
val_fmt_engineering as fmt_engineering,
)
from ._formats_vals import (
val_fmt_image as fmt_image,
)
from ._formats_vals import (
val_fmt_integer as fmt_integer,
)
from ._formats_vals import (
val_fmt_markdown as fmt_markdown,
)
from ._formats_vals import (
val_fmt_number as fmt_number,
)
from ._formats_vals import (
val_fmt_percent as fmt_percent,
)
from ._formats_vals import (
val_fmt_roman as fmt_roman,
)
from ._formats_vals import (
val_fmt_scientific as fmt_scientific,
)
from ._formats_vals import (
val_fmt_time as fmt_time,
)

Fixed. Reverted to the grouped import style matching main branch.

  6. Test cases use many inputs (confidence: 25) - First test case uses 13 values when fewer would suffice. However, the "test one behavior" pattern isn't in the main CLAUDE.md.

(
dict(decimals=2),
[
829300232923103939802.4,
492032183020.5,
84930284002.1,
203820929.2,
84729202.4,
2323435.1,
230323.4,
50000.01,
1000.001,
10.00001,
1.2345,
0.12345,
0.0000123456,
],
[
"829.30 × 10<sup style='font-size: 65%;'>18</sup>",
"492.03 × 10<sup style='font-size: 65%;'>9</sup>",
"84.93 × 10<sup style='font-size: 65%;'>9</sup>",
"203.82 × 10<sup style='font-size: 65%;'>6</sup>",
"84.73 × 10<sup style='font-size: 65%;'>6</sup>",
"2.32 × 10<sup style='font-size: 65%;'>6</sup>",
"230.32 × 10<sup style='font-size: 65%;'>3</sup>",
"50.00 × 10<sup style='font-size: 65%;'>3</sup>",
"1.00 × 10<sup style='font-size: 65%;'>3</sup>",
"10.00",
"1.23",
"123.45 × 10<sup style='font-size: 65%;'>−3</sup>",
"12.35 × 10<sup style='font-size: 65%;'>−6</sup>",
],

Addressed. Reduced test inputs to minimum needed to demonstrate each feature while still covering key edge cases (boundary values, positive/negative, zero, extreme magnitudes).

  7. Redundant test data across cases (confidence: 25) - Multiple test cases for force_sign_m and force_sign_n use identical 7-value input lists when 2-3 values would demonstrate each feature.

dict(decimals=2, force_sign_m=True),
[-3.49e13, -3453, -0.000234, 0, 0.00007534, 82794, 7.16e14],
[
"−34.90 × 10<sup style='font-size: 65%;'>12</sup>",
"−3.45 × 10<sup style='font-size: 65%;'>3</sup>",
"−234.00 × 10<sup style='font-size: 65%;'>−6</sup>",
"0.00",
"+75.34 × 10<sup style='font-size: 65%;'>−6</sup>",
"+82.79 × 10<sup style='font-size: 65%;'>3</sup>",
"+716.00 × 10<sup style='font-size: 65%;'>12</sup>",
],
),
(
dict(decimals=2, force_sign_n=True),
[-3.49e13, -3453, -0.000234, 0, 0.00007534, 82794, 7.16e14],
[
"−34.90 × 10<sup style='font-size: 65%;'>+12</sup>",
"−3.45 × 10<sup style='font-size: 65%;'>+3</sup>",
"−234.00 × 10<sup style='font-size: 65%;'>−6</sup>",
"0.00",
"75.34 × 10<sup style='font-size: 65%;'>−6</sup>",
"82.79 × 10<sup style='font-size: 65%;'>+3</sup>",
"716.00 × 10<sup style='font-size: 65%;'>+12</sup>",
],
),
(
dict(decimals=2, force_sign_m=True, force_sign_n=True),
[-3.49e13, -3453, -0.000234, 0, 0.00007534, 82794, 7.16e14],
[
"−34.90 × 10<sup style='font-size: 65%;'>+12</sup>",
"−3.45 × 10<sup style='font-size: 65%;'>+3</sup>",
"−234.00 × 10<sup style='font-size: 65%;'>−6</sup>",
"0.00",
"+75.34 × 10<sup style='font-size: 65%;'>−6</sup>",
"+82.79 × 10<sup style='font-size: 65%;'>+3</sup>",
"+716.00 × 10<sup style='font-size: 65%;'>+12</sup>",
],

Addressed. Reduced force_sign_m, force_sign_n tests from 7 values to 2-3 values each, covering positive, negative, and zero cases.
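
For reference, the sign behavior these tests exercise can be approximated in plain text with the sketch below. It is an assumption-laden illustration rather than the library code: `fmt_eng_text()` is a hypothetical name, the output uses `10^n` instead of the HTML `<sup>` markup, and the handling of a forced sign when the exponent is 0 is a guess that the quoted expected values do not cover.

```python
import math

MINUS = "\u2212"  # typographic minus sign, as seen in the expected outputs


def fmt_eng_text(
    x: float,
    decimals: int = 2,
    force_sign_m: bool = False,
    force_sign_n: bool = False,
) -> str:
    """Plain-text approximation of engineering notation with optional '+' signs."""
    if x == 0:
        # Zero carries no sign and no exponent part in the expected outputs.
        return f"{0.0:.{decimals}f}"
    # Exponent rounded down to a multiple of 3; mantissa formatted unsigned.
    n = 3 * math.floor(math.log10(abs(x)) / 3)
    m_text = f"{abs(x) / 10.0**n:.{decimals}f}"
    m_sign = MINUS if x < 0 else ("+" if force_sign_m else "")
    if n == 0:
        # Values in [1, 1000) are shown without a power-of-ten part.
        return f"{m_sign}{m_text}"
    n_sign = MINUS if n < 0 else ("+" if force_sign_n else "")
    return f"{m_sign}{m_text} × 10^{n_sign}{abs(n)}"


# Three inputs (negative, zero, positive) are enough to show each option.
mantissa_signs = [fmt_eng_text(v, force_sign_m=True) for v in (-3.49e13, 0, 0.00007534)]
exponent_signs = [fmt_eng_text(v, force_sign_n=True) for v in (-0.000234, 0, 82794)]
```

A negative, a zero, and a positive input already distinguish the two options, which matches the reduced 2-3 value test lists described above.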

Collaborator

@machow machow left a comment

Thanks, this looks great! One final thing I might suggest, from reading the docstring:

I wonder if it'd be helpful to frontload an example, like...

With numeric values in a table, we can perform formatting so that the targeted values are rendered in engineering notation. For example, the number 0.0000345 in engineering notation can be 34.50 x 10^-6. Engineering notation represents numbers as a mantissa (m) and an exponent (n), in the form m x 10^n or mEn. ...

Essentially, move the example from the end to the front so that it becomes a worked example.
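
The arithmetic behind that worked example can be traced step by step. The snippet below is only a sketch of the calculation for 0.0000345, written in plain Python; it does not call great_tables, and the exact text the library emits for the E-style form may differ.

```python
import math

x = 0.0000345

# Step 1: base-10 logarithm of the magnitude (about -4.46).
log10_x = math.log10(abs(x))

# Step 2: round the exponent down to the nearest multiple of 3.
n = 3 * math.floor(log10_x / 3)

# Step 3: divide by 10^n to get the mantissa (about 34.5).
m = x / 10.0**n

# The two written forms mentioned in the docstring: m x 10^n and mEn.
product_form = f"{m:.2f} x 10^{n}"
e_form = f"{m:.2f}E{n}"
```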

@rich-iannone rich-iannone merged commit 6b224c0 into main Jan 16, 2026
14 checks passed
@rich-iannone rich-iannone deleted the feat-fmt-engineering branch January 16, 2026 14:46